Incident Management (IcM) refers to the activities of an organization to identify, analyze and correct hazards. For instance, a fire in a factory would be a risk that has been realized, that is, an incident that has happened. An Incident Response Team (IRT) or an Incident Management Team (IMT), designated for the task either beforehand or on the spot, would then manage the organization through the incident.
Usually as part of the wider management process in private organizations, incident management is followed by post-incident analysis where it is determined why the incident happened despite precautions and controls. This information is then used as feedback to further develop the security policy and/or its practical implementation. In the USA, the National Incident Management System, developed by the Department of Homeland Security, integrates effective practices in emergency management into a comprehensive national framework.
A specific case of incident management is computer security incident management, which is most often handled by a Computer Security Incident Response Team (CSIRT). For example, if an organization discovers that an intruder has gained unauthorized access to a computer system, the CSIRT would analyze the situation, determine the breadth of the compromise, and take corrective action. Computer forensics is one task included in this process.
The definition of an incident differs between ITIL versions. ITIL V3 defines an incident as an unplanned interruption to an IT Service or a reduction in the Quality of an IT Service. Failure of a Configuration Item that has not yet impacted Service is also an Incident, for example the failure of one disk from a mirror set. ITIL V2 defines an incident as an event which is not part of the standard operation of a service and which causes, or may cause, disruption to or a reduction in the quality of services and Customer productivity. The objective of incident management is to restore normal operations as quickly as possible with the least possible impact on either the business or the user, at a cost-effective price.
The Incident Manager is a functional role and not a position. Incident management provides the external customer with a focal point for leadership and drive during an event by ensuring follow-up on commitments and adequate information flow. This means presenting to the customer an entity that accepts ownership of their problem.
The objective of Incident Management during an incident is to restore service as quickly as possible. The objective is not to make a system perfect. If service can be restored more quickly by a temporary workaround than by correcting the underlying root cause of the issue, then that is acceptable. After service restoration, correction of underlying root causes is done by the Problem Management team through a process called Root Cause Analysis (RCA). An example of service restoration by temporary workaround is the one carried out on Apollo 13.
The primary focus of Incident Management is to ensure a prompt recovery of the system, supervising and directing the internal or external resources. Prompt system recovery and minimization of any impact on the customer have priority over unreasonably long and intensive data collection for the investigation of the event's root cause.
Incidents can be classified into three primary categories: software (applications), hardware, and service requests. (Note that service requests are not always regarded as incidents, but rather as requests for change. However, the handling of failures and the handling of service requests are similar, and both are therefore included in the definition and scope of the incident management process.)
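The three categories above can be represented directly in a ticketing data model. The following is a minimal sketch only; the names `Category`, `Ticket` and `is_failure` are hypothetical and do not come from ITIL or from any particular product. It shows service requests sharing the same record type as failures while remaining distinguishable from them.

```python
from dataclasses import dataclass
from enum import Enum


class Category(Enum):
    """The three primary categories named in the text."""
    SOFTWARE = "software"
    HARDWARE = "hardware"
    SERVICE_REQUEST = "service request"


@dataclass
class Ticket:
    """Hypothetical record type shared by failures and service requests."""
    summary: str
    category: Category

    @property
    def is_failure(self) -> bool:
        # Service requests go through the same process but are not failures.
        return self.category is not Category.SERVICE_REQUEST


tickets = [
    Ticket("Payroll application crashes on login", Category.SOFTWARE),
    Ticket("One disk in a mirror set has failed", Category.HARDWARE),
    Ticket("New starter needs a mailbox", Category.SERVICE_REQUEST),
]

# Keep only the tickets that describe a failure.
failures = [t.summary for t in tickets if t.is_failure]
```

Keeping one record type for all three categories mirrors the point that failures and service requests are handled in a similar way, while the `is_failure` flag still lets reports separate them.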
ITIL separates incident management into six basic components:
From ITIL point of view, the activities of Incident Management are:
An Incident Manager should be able to:
Incident management software systems are designed for collecting consistent, documented incident report data. Many of these products include features to automate the approval process of an incident report or case investigation. Additionally, incident report systems can automatically send notifications, assign tasks and escalate to appropriate individuals depending on the incident type, priority, time, status and custom criteria. Modern products allow administrators to configure the incident report forms as needed, create analysis reports and set access controls on the data.
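The rule-based behaviour described above (assignment, notification and escalation driven by type, priority, time and status) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the team names, the time thresholds, the `route` function and the convention that priority 1 is the most urgent do not come from any real product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    incident_type: str
    priority: int            # assumption: 1 is the most urgent
    opened_at: datetime
    status: str = "open"


# Hypothetical routing table: incident type -> responsible team.
ASSIGNEES = {"hardware": "infrastructure-team", "software": "application-team"}
DEFAULT_ASSIGNEE = "service-desk"

# Hypothetical escalation thresholds per priority.
ESCALATE_AFTER = {1: timedelta(minutes=30), 2: timedelta(hours=4)}
DEFAULT_ESCALATE_AFTER = timedelta(hours=24)


def route(incident: Incident, now: datetime) -> list[str]:
    """Return the actions a rule engine would take for this incident."""
    assignee = ASSIGNEES.get(incident.incident_type, DEFAULT_ASSIGNEE)
    actions = [f"assign:{assignee}", f"notify:{assignee}"]

    # Escalate unresolved incidents that have been open longer than allowed.
    limit = ESCALATE_AFTER.get(incident.priority, DEFAULT_ESCALATE_AFTER)
    if incident.status != "resolved" and now - incident.opened_at > limit:
        actions.append("escalate:incident-manager")
    return actions


opened = datetime(2024, 1, 1, 9, 0)
disk_failure = Incident("hardware", priority=1, opened_at=opened)
early = route(disk_failure, opened + timedelta(minutes=10))
late = route(disk_failure, opened + timedelta(hours=1))
```

In this sketch `early` contains only the assignment and notification actions, while `late` also contains the escalation, because the priority 1 threshold of 30 minutes has passed. A real product would typically let administrators configure such tables rather than hard-coding them.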